Some individuals with non-small-cell lung cancer (NSCLC) benefit from therapies targeting epidermal
growth factor receptor (EGFR), and the characterization of a new mechanism of resistance to the
EGFR-specific inhibitor gefitinib will provide valuable insight into how therapeutic strategies might be
designed to overcome this particular resistance mechanism. The G719S and T790M mutations and their
combination cause different conformational redistributions of EGFR. In the present
computational study, we analyzed the structural impact of the G719S/T790M double mutation
(DM) in EGFR bound to the ligand gefitinib through molecular dynamics simulation (50 ns) and docking analysis.
We observed an increase in the distance between the P-loop and the activation loop in the
T790M mutant compared with the G719S mutant. Furthermore, we found that the G719S mutation
causes the ligand to move closer to the hinge region, whereas T790M makes the ligand escape from the
binding pocket. These results provide an explanation for the resistance induced by T790M and a
vital clue for the design of drugs to combat gefitinib resistance.

Human EGFR is one of the most studied members of the receptor tyrosine kinase (RTK) family owing to its
vital role in the signal transduction pathways that regulate key cellular functions and its importance as a
drug target. EGFR is a multi-domain protein consisting of an extracellular domain, a single transmembrane
domain, and an intracellular tyrosine kinase (TK) domain. As shown in Fig. 1, the EGFR kinase domain consists of
an N-terminal lobe (N-lobe), a C-terminal lobe (C-lobe), and a hinge region connecting the two lobes. Residue
T790 is in the hinge region, whereas residue G719 is in the P-loop region that forms part of the ATP-binding
pocket. The ATP-binding pocket consists of the hinge region, P-loop, C helix, and activation loop. The threonine
residue at position 790 is known as the gatekeeper, which controls the access of inhibitors to a deep
hydrophobic pocket in the ATP-binding site. Activation of the receptor by growth factors or other cognate ligands
induces receptor dimerization and the autophosphorylation of key tyrosine residues within the
carboxyl-terminal portion of the receptor. These phosphorylated tyrosine residues serve as docking sites for various signal
transducers, which initiate multiple signaling pathways, including those resulting in cancer phenotypes. The
aberrant activation of EGFR has been implicated in several key aspects of human neoplasia, including the
increased proliferation, survival, and invasiveness of cancer cells. Recent studies reported the association of
mutations in the TK domain of EGFR with NSCLC. Cells bearing mutant EGFR proteins show oncogenic
properties but typically also exhibit greater sensitivity to inhibitors than cells bearing the wild-type (WT) EGFR
protein.

Gefitinib, the most common TK inhibitor (TKI), blocks signal transduction pathways implicated in cancers.
NSCLC patients who initially respond to TKIs eventually develop acquired drug resistance through the
secondary mutation T790M. Mutation of the gatekeeper residue threonine at position 790 was first thought
to reduce the affinity of the protein for the drug by creating steric hindrance in the binding site. However, Yun
et al. (2008) showed that both the single T790M mutant and the double mutant L858R/T790M maintain the
same low nanomolar affinity for gefitinib as the L858R mutant. By contrast, the T790M mutation confers a higher
affinity toward ATP than the L858R mutant, such that the combined double mutant L858R/T790M results in an
activated enzyme that is resistant to ATP-competitive TKIs.

Figure 1 | Schematic (ribbon) representation of the crystal structure of the EGFR kinase domain bound to gefitinib, rendered with PyMOL.
Gefitinib is shown as sticks in the atomic color scheme (C in green, O in red, and N in blue). Structural elements: N-lobe (grey, red, and cyan),
C-lobe (white), hinge region (residues 788-797, violet), P-loop (residues 712-731, red), C helix (residues 752-767, green), and activation loop
(residues 855-877, blue).

A recent report by Yoshikawa et al. (2013) demonstrated the
acquired resistance of the double mutant G719S/T790M (DM) to gefitinib.
The G719S mutation occurs within the phosphate-binding loop (P-loop)
and is not observed frequently. The structure of the EGFR DM
(G719S/T790M) has been solved and deposited in the PDB. Although the
biological effects of the important EGFR mutations at the molecular
level are clear, a mechanistic explanation linking the mutations to
changes in explicit dynamic properties is still lacking. Thanks
to advances in force fields and the use of specialized computer
architectures or enhanced sampling methods, it is now possible to
use all-atom molecular dynamics (MD) simulations to portray
accurately the complex conformational transitions involved in drug
resistance. Therefore, to elucidate the structural and dynamic
consequences of the DM on the catalytic domain of EGFR and its affinity
toward gefitinib, we performed molecular dynamics simulations
(50 ns) of wild-type (WT) EGFR and three oncogenic mutants,
G719S, T790M, and DM, in complex with gefitinib.

Results 

To examine the structural basis of the acquired drug resistance, we 
analyzed the structural dynamics and energetic effects of single and 
double EGFR mutations (G719S/T790M). 

Molecular modeling. The 3D structures of the EGFR single mutants
G719S and T790M were predicted computationally with the
RosettaBackrub server. Among the top ten models built by the server, the
best model was identified using the confidence score of the structure
modeling, which estimates the quality of the predicted models
(Supplementary Table 1). The modeled structures were further validated
using the SAVES server (Supplementary Table 2). The validation
statistics showed good stereochemical quality, with more than
96% of residues in the core region. The final protein models were
subjected to energy minimization and MD simulation (50 ns) in
GROMACS to stabilize the protein.

Flexibility and compactness in WT and mutant EGFRs. The time
evolution of the RMSD of the protein backbone atoms was analyzed
for the simulations of WT and mutant EGFR (Supplementary
Figure 1). In each case, the energy-minimized starting
structure was taken as the reference. To assess the convergence and
consistency of the systems at all-atom detail, two independent 50 ns
MD simulations were run for the WT and for each mutant structure
(G719S, T790M, and DM).
No significant drift was observed between the trajectories
obtained from the repeated simulations of each EGFR structure. The two
simulated structures of each system aligned with a backbone
root-mean-square deviation (RMSD) below 3.5 Å (Supplementary
Figure 1).

After ~10,000 ps, the T790M mutant showed a different deviation
pattern until the end of the simulation, with a backbone RMSD
of ~0.29 to 0.45 nm, whereas the G719S and DM mutants (Fig. S1) did
not deviate far from the WT protein toward the end of the
simulation period. This magnitude of fluctuation, together with the small
difference in the average RMSD values, leads to the conclusion that the
simulations produced stable trajectories, thus providing a suitable
basis for further analysis.
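
As an illustration of the quantity discussed above, the following is a minimal Python sketch of a backbone RMSD time series. It is not the GROMACS implementation used in this study: the function names and toy coordinates are hypothetical, the coordinates are assumed to be in nm, and the frames are assumed to have been superimposed on the reference beforehand (no least-squares fitting is done here).

```python
import math


def rmsd(reference, frame):
    """RMSD between two coordinate sets given as lists of (x, y, z) tuples.

    Assumption: both sets list the same atoms in the same order and are
    already superimposed; this function performs no fitting.
    """
    if len(reference) != len(frame):
        raise ValueError("coordinate sets must contain the same number of atoms")
    squared_sum = sum(
        (a - b) ** 2
        for ref_atom, atom in zip(reference, frame)
        for a, b in zip(ref_atom, atom)
    )
    return math.sqrt(squared_sum / len(reference))


def rmsd_series(reference, trajectory):
    """RMSD of every frame in a trajectory against one reference structure."""
    return [rmsd(reference, frame) for frame in trajectory]


# Hypothetical toy data: three backbone atoms, two frames.
reference = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
shifted = [(x + 0.3, y, z) for x, y, z in reference]
series = rmsd_series(reference, [reference, shifted])
```

In this toy case the first frame is identical to the reference and the second is a rigid 0.3 nm shift, so the series rises from zero to about 0.3 nm. With fitting enabled, a rigid shift of this kind would be removed before the deviation is measured.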

Comparison of RMSF values between WT and DM. The RMSF
values of the Cα atoms of each residue were computed for
WT and DM to understand the residue-wise mobility of the two
proteins (Fig. 2). We observed that the DM tended to show smaller
fluctuations than either single point mutant or the WT. The DM
caused a decrease in mobility, specifically in alpha helix 2 and
around the central part of the protein. The functional importance of
this region was explored via the docking of gefitinib to EGFR,
which demonstrated its active participation in protein-inhibitor
complex formation. Thus, a decrease in the mobility of
this region might be responsible for the alteration in the functional
activity of the mutant protein. It should be emphasized that an increase
in rigidity and a reduction in overall flexibility were observed upon
mutation, effects that might alter the binding properties
of the protein.
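
For readers unfamiliar with the measure, a per-atom RMSF can be sketched in a few lines of Python. This is an illustrative sketch only, not the GROMACS tool used here: the function name and toy trajectory are hypothetical, and the frames are assumed to be pre-aligned so that fluctuations reflect internal motion rather than overall tumbling.

```python
import math


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation about the time-averaged position.

    trajectory: list of frames, each frame a list of (x, y, z) tuples.
    Assumption: every frame lists the same atoms in the same order.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from its average position.
        msf = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in trajectory for k in range(3)
        ) / n_frames
        values.append(math.sqrt(msf))
    return values


# Hypothetical toy trajectory: atom 0 is fixed, atom 1 oscillates along x.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
]
fluctuations = rmsf(toy_trajectory)
```

A rigid atom gives an RMSF of zero, while the oscillating atom gives about 0.1, which is the sense in which a lower RMSF profile for DM is read as reduced residue mobility.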

To analyze the effect of mutation on the collective movement of
the protein, we performed a fluctuation analysis about the average
structures of the 50,000 ps trajectories of the WT and all mutant
structures, and characterized them with respect to the P-loop
(residues 712-731) and hinge region (residues 788-797). In the P-loop
region of EGFR, reduced flexibility was observed for both the DM and
G719S mutants, whereas T790M exhibited higher fluctuation (Fig. 3),
confirming the reduced P-loop flexibility caused by the G719S mutation
within this loop, as proposed by Yoshikawa et al. (2013). Similarly,
both the DM and G719S mutants exhibited lower fluctuation
than T790M in the hinge region. Thus, a reduction in the mobility
of this region might be responsible for the observed alteration in the
functional activity of the mutant protein (Fig. 4).

Impact of mutations on secondary structural elements and the
binding pocket. The time-dependent distance between the mass
centers of the P-loop and the activation loop was
calculated to detect the relative movement induced by the mutations
(Fig. 5). Our analysis revealed that the T790M mutation significantly
increased the distance between the P-loop and activation loop,
whereas the G719S mutation significantly shortened it
(Supplementary Figure 2).
This result indicates that the secondary T790M mutation retains the
active conformation of the binding site, as reported by Yoshikawa et al.
(2013). To further characterize the effect of the T790M mutation on
the conformational distribution of structural elements, we calculated
the time-dependent distance between the mass centers of gefitinib
and the active site residue M793 for the WT and mutant proteins.
The docking modes of gefitinib with WT and mutant EGFR
were in agreement with the recently reported low-resolution
structure of the complex determined by X-ray scattering
analysis (9), in which gefitinib forms hydrogen bonds with active site
residue M793 (Supplementary Table 3). As shown in the histogram in Fig. 6,
the T790M mutant exhibited a higher average distance between
M793 and the drug, whereas the G719S mutation caused gefitinib to
move closer to the binding site. In DM, the distance between the drug
and M793 was shorter than in WT because of the secondary
mutation (Supplementary Figure 3). This result suggests that the
T790M secondary mutation effectively restores the nucleotide
binding property of the G719S mutant, as observed for the L858R
mutant.

Figure 3 | Comparison of the RMS fluctuation of the P-loop of WT, G719S, T790M, and DM over the 50 ns of the trajectory. Color scheme: WT, black;
G719S, green; T790M, red; DM, violet.
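
The distance between mass centers used in this section can be sketched as follows. This is an illustrative Python example under stated assumptions, not the analysis script used in the study: the function names, masses, and coordinates are hypothetical, and each frame is assumed to supply the coordinates of the two atom groups (for example, a loop and the ligand) in nm.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms."""
    total_mass = sum(masses)
    return tuple(
        sum(m * atom[k] for m, atom in zip(masses, coords)) / total_mass
        for k in range(3)
    )


def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Euclidean distance between the mass centers of two atom groups."""
    return math.dist(
        center_of_mass(coords_a, masses_a),
        center_of_mass(coords_b, masses_b),
    )


# Hypothetical toy system: a two-atom "loop" and a one-atom "ligand".
loop_masses = [12.011, 12.011]
ligand_masses = [14.007]
frames = [
    ([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [(1.0, 3.0, 4.0)]),
    ([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [(1.0, 0.0, 1.0)]),
]
distances = [
    com_distance(loop, loop_masses, ligand, ligand_masses)
    for loop, ligand in frames
]
```

Evaluating the function once per saved frame gives the time series that is plotted or binned into a histogram; in the toy data the ligand moves from about 5 to about 1 distance units from the loop.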



To identify the specific structural change in the binding pocket
that resulted in the observed ligand movements, we calculated the
time-dependent distance between the EGFR pharmacophore residues
in the hydrophobic region (L718 and G796). Our analysis
indicated that the T790M mutation shortened the distance
between residues L718 and G796 (Fig. 7). By contrast, the G719S
mutation increased this distance.
For DM, the distance was the lowest of all, compared with WT
and both single mutants. The decreased distance between L718 and
G796 led to a smaller slot in the hydrophobic region, which in
turn facilitated the exclusion of gefitinib from the binding pocket.
We also compared the gefitinib-binding modes of the WT and DM
structures. The main hydrogen bond between EGFR (Met 793)
and gefitinib is common to the WT and mutant models; however,
the aniline ring of gefitinib was shifted upward in DM compared
with WT EGFR. This shift is presumably an adaptation by
gefitinib to the structural change caused by the EGFR DM
(Supplementary Figure 4).

The Dictionary of Secondary Structure of Proteins (DSSP) program
was applied to the EGFR WT and mutant
models, and the resulting changes in secondary structure are illustrated in
Supplementary Figure 5 (A-D). In the G719S mutant, minimal
changes were observed in the coil region near the site of the
point mutation. Specifically, residues 710 to
720 showed conformational changes from coil to bend, and toward
the C-lobe conformational changes from turns to coils began to
dominate. In the case of DM, most of the alterations affected the
hinge region, the activation loop, and the C-lobe secondary structure
regions, with helical elements replaced by turns and bends
during the course of the simulation. Throughout the simulation, the native
structure retained higher percentages of the native secondary
structural element conformations than the mutant
structures.

Figure 7 | Time-dependent distance between the mass centers of residues
L718 and G796 for WT, G719S, T790M, and DM over the 50 ns of the
trajectory. Color scheme: WT, black; G719S, green; T790M, red; DM,
violet.

Principal component analysis. PCA was performed on all four
trajectories of the EGFR WT and mutant forms to monitor the overall
collective motions of the protein. Covariance matrices were
built over the Cα atoms of the protein for each trajectory and used to
capture the degree of co-linearity in the atomic positions of each pair
of atoms for the 324 residues of the gefitinib-bound EGFR structure. The
eigenvalues obtained through the diagonalization of the covariance
matrix describe the atomic contribution to the motion, and the
eigenvectors describe the collective motion accomplished by the
particles (van der Spoel et al., 2005). A total of 580 eigenvectors
were generated for the entire trajectory; the first five
eigenvectors accounted for more than 90% of the overall system
motion of the native trajectory, whereas the top seven eigenvectors
accounted for 85% of the overall motion of the double mutant. Within the
top eigenvectors, the first two accounted for a significant share of the
overall motion in each case. The projection onto the first two principal
components displays the motion of the native and mutant forms in
phase space. Here, the overall flexibility was calculated as the trace of the
diagonalized covariance matrix. The trace values for the WT, G719S,
T790M, and DM structures of EGFR were 26.234 nm²,
19.671 nm², 32.789 nm², and 12.018 nm², respectively (Fig. 8A-D).
Among these values, that of T790M was the highest, suggesting an
overall increase in flexibility relative to the native model, whereas
DM exhibited the lowest value, confirming the decrease in flexibility in
the collective motion of the protein. From these projections, it was
observed that the clusters of DM were well defined and more stable
than those of the other protein models. The DM form covered a
smaller region of conformational space than the WT and the other
mutant forms, as shown in Figure 8.
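
The two summary quantities used above, the trace of the covariance matrix and the number of eigenvectors needed to reach a given share of the motion, can be illustrated with a short Python sketch. It rests on two facts: the trace of a covariance matrix equals the sum of the per-coordinate variances, and it also equals the sum of the eigenvalues, so the trace can be computed without diagonalization. The function names, the toy data, and the normalization by the number of frames are assumptions for illustration; they are not taken from the GROMACS tools used in the study.

```python
def covariance_trace(trajectory):
    """Trace of the positional covariance matrix of a trajectory.

    trajectory: list of frames, each a flat list of coordinates
    [x1, y1, z1, x2, ...]. The trace is the sum of the variances of all
    coordinates. Assumption: population variance (division by the number
    of frames) and no mass weighting.
    """
    n_frames = len(trajectory)
    n_coords = len(trajectory[0])
    trace = 0.0
    for j in range(n_coords):
        mean = sum(frame[j] for frame in trajectory) / n_frames
        trace += sum((frame[j] - mean) ** 2 for frame in trajectory) / n_frames
    return trace


def components_needed(eigenvalues, fraction):
    """Smallest number of top eigenvalues whose sum reaches `fraction` of the total."""
    total = sum(eigenvalues)
    running = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if running / total >= fraction:
            return count
    return len(eigenvalues)


# Hypothetical toy data: two frames, two coordinates; only the first varies.
toy_trace = covariance_trace([[0.0, 1.0], [2.0, 1.0]])

# Hypothetical eigenvalue spectrum, not taken from the paper.
toy_count = components_needed([5.0, 3.0, 1.0, 1.0], 0.75)
```

Read this way, a smaller trace (as reported for DM) means less total positional variance, and a spectrum in which a few eigenvalues carry most of the sum means that the motion is dominated by a few collective modes.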

Discussion 

The major drawback of TKI therapy is the development of secondary
resistance caused by the acquisition of new mutations, as best
exemplified by imatinib-resistant mutations in BCR-ABL-positive CML.
To our knowledge, this mechanism of drug resistance, i.e., resistance
conferred by a mutation that increases the affinity for a competing
physiologic substrate, has not been previously reported within a
clinical context. Interestingly, a distinct but related effect has recently
been described in a mutant of the mitotic kinesin KSP, in which
drug resistance is conferred by an allosteric mechanism involving an
enhanced affinity for ATP.

In light of the present study, we can rationalize and quantify the
epistatic effect due to the occurrence of secondary mutations in
EGFR. The altered drug resistance mechanism of
the EGFR double mutant arises from a change in the active site
conformation. As proposed in a previous study, the total stabilization
of the active state by DM is greater than would be expected from a
simple combination of the stabilization due to the two single mutations.
In the EGFR double mutant, the gatekeeper residue T790M is situated at
the top of the hydrophobic spine and stabilizes the active
conformation. In agreement with a recent study, the gatekeeper T790M
mutant does not appear to act via steric hindrance with inhibitors
but rather by stabilizing the active conformation. In this case, the
methionine participates in the hydrophobic core surrounding the
active site. These results agree with the enhanced stabilization of
the catalytic site observed when comparing the collective motions
of the WT and mutant kinase domains.

According to our analysis, the combination of
mutations (G719S/T790M DM) in EGFR both carries the rigidifying
effects of the two single mutations and stabilizes the correct
helical structure of the αC-helix. In particular, the T790M mutation
decreased the size of the hydrophobic slot formed by L718 and G796
in the ATP-binding pocket (Fig. 7), suggesting that the design of
inhibitors of the T790M mutant should avoid targeting this region. We
found that the effect of DM is not a simple sum of the
individual mutations; rather, the secondary T790M mutation
reversed the effect of G719S on the distance between the P-loop and
activation loop. These EGFR mutants should therefore be considered
an invaluable tool for evaluating the activity of novel, potentially more
potent, ATP-competitive inhibitors for NSCLC patients.

Methods 

Structural modeling and docking study. For WT and DM, we retrieved the
available structures from the PDB (WT: PDB ID 3VJO, chain A, 2.64 Å; DM: PDB
ID 3UG2, chain A, 2.50 Å). To model the G719S and T790M
mutant proteins, we used the RosettaBackrub web server, which is based on the
ab initio modeling technique of Rosetta 3.1. The RosettaBackrub server provides an easily
accessible interface to Rosetta predictions and implements three applications that
use the backrub method for flexible protein-backbone modeling and design. The
models with high scores and good topologies were selected as candidate structures.

The AutoDock 4.2 suite was used as the molecular docking tool to perform the
docking simulations. A Lamarckian genetic algorithm (GA) was used as the search
method. The Lamarckian GA parameters used in the study were as follows: number
of runs, 30; population size, 150; maximum number of evaluations, 25,000,000;
number of generations, 27,000; rate of gene mutation, 0.02; and rate of crossover,
0.8. Docking was performed using grid sizes of 60, 60, and 60 points along the X, Y, and Z axes,
with 0.375 Å spacing. The RMS cluster tolerance was set to 2.0 Å.

Molecular dynamics simulation. Classical molecular dynamics simulations of the
EGFR receptor in a ligand-bound state (with gefitinib) were performed using the
GROMACS 4.5 package. The GROMOS 43A1 force field was adopted to analyze
the ligand-bound dynamics; the ligand force field parameters were provided by the PRODRG
program. The protein-ligand complex structure was solvated in a triclinic water box
under periodic boundary conditions, with a 1.0 nm minimum distance from the
protein to the box faces, and the system was neutralized by adding two Cl⁻ ions to the
solvent. The final systems consisted of approximately 25,000 atoms. Following
steepest descent minimization, the systems were equilibrated in the canonical
ensemble (NVT conditions for 500 ps at 300 K) and, subsequently, in the
isothermal-isobaric ensemble (NPT conditions for 500 ps) with
position restraints applied to the protein. Lastly, all restraints were removed, and 50 ns
molecular dynamics runs were performed twice under NPT conditions at 300 K. To
maintain constant pressure (1 atm) with isotropic coordinate scaling, the
Parrinello-Rahman barostat was used with a relaxation time of 2.0 ps. Van der Waals
interactions were modeled using 6-12 Lennard-Jones potentials with a 1.4 nm
cut-off. The long-range electrostatic interactions were calculated using the PME method,
with a cut-off for the real-space term of 0.9 nm. Covalent bonds were constrained
using the LINCS algorithm. The time step employed was 2 fs, and the coordinates
were saved every 2 ps for analysis, which was performed using standard GROMACS
tools.
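
The run length, time step, and saving interval stated above fix the number of integration steps and saved frames per trajectory. The short Python sketch below only works through that arithmetic. The variable names are illustrative; in a GROMACS parameter file the corresponding settings are `dt`, `nsteps`, and an output interval such as `nstxout`, but which output option the authors used is not stated.

```python
# Values taken from the Methods text.
TIME_STEP_FS = 2        # integration time step
RUN_LENGTH_NS = 50      # production run length
SAVE_INTERVAL_PS = 2    # coordinates written every 2 ps

FS_PER_PS = 1000
PS_PER_NS = 1000

# Total number of integration steps in one production run.
total_fs = RUN_LENGTH_NS * PS_PER_NS * FS_PER_PS
n_steps = total_fs // TIME_STEP_FS

# Number of steps between two saved frames.
steps_per_save = SAVE_INTERVAL_PS * FS_PER_PS // TIME_STEP_FS

# Saved frames per run, not counting the starting frame at t = 0.
n_saved_frames = n_steps // steps_per_save

print(n_steps, steps_per_save, n_saved_frames)
```

Each 50 ns run therefore corresponds to 25,000,000 steps with a frame written every 1,000 steps, giving 25,000 frames per trajectory for the RMSD, RMSF, distance, and PCA analyses.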

Principal component analysis. The trajectories of the MD simulations were used to
identify the motions of the native and mutant EGFR models. We used principal
component analysis (PCA) to extract the principal modes involved in the motion of the
protein molecule. A covariance matrix was assembled using a simple linear
transformation in Cartesian coordinate space. A vectorial depiction of every single
component of the motion indicates the direction of motion. For this, a set of
eigenvectors was derived through the diagonalization of the covariance matrix. Each
eigenvector has a corresponding eigenvalue that describes the energetic contribution
of each component to the motion. The protein regions that are responsible for the
most significant collective motions can be identified through PCA. The
GROMACS built-in tools g_covar and g_anaeig were used to perform PCA.

Molecular imaging & MD analysis. All the protein structural visualizations were
performed using PyMOL (DeLano 2002). The trajectories were analyzed using the
built-in tools in the GROMACS distribution. Secondary structure analysis
was performed using the DSSP program. All the graphical displays were generated
using the Xmgrace program.